{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "resident-number",
   "metadata": {},
   "source": [
    "# Bayesian Classifiers"
   ]
  },
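  {
   "cell_type": "markdown",
   "id": "handy-palestine",
   "metadata": {},
   "source": [
    "Bayesian decision theory is the basic approach to making decisions within a probabilistic framework. For a classification task, in the ideal case where all relevant probabilities are known, it considers how to choose the optimal class label based on these probabilities and the misclassification losses."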
   ]
  },
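  {
   "cell_type": "markdown",
   "id": "soviet-colon",
   "metadata": {},
   "source": [
    "Suppose there are $N$ possible class labels, $Y=\\{c_1,c_2,\\ldots,c_N\\}$, and let $\\lambda_{ij}$ be the loss incurred by misclassifying a sample whose true label is $c_j$ as $c_i$. Using the posterior probabilities $P(c_j|x)$, we obtain the expected loss of classifying sample $x$ as $c_i$, that is, the \"conditional risk\" on sample $x$:\n",
    "$$R(c_i|x)=\\displaystyle\\sum_{j=1}^N \\lambda_{ij}P(c_j|x)$$"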
   ]
  },
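  {
   "cell_type": "markdown",
   "id": "oriented-radical",
   "metadata": {},
   "source": [
    "Our task is to find a decision rule $h: X \\mapsto Y$, mapping the sample space $X$ to the label set $Y$, that minimizes the overall risk\n",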
    "$$R(h)=E_x[R(h(x)|x)]$$"
   ]
  },
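  {
   "cell_type": "markdown",
   "id": "light-conspiracy",
   "metadata": {},
   "source": [
    "Clearly, if $h$ minimizes the conditional risk $R(h(x)|x)$ for every sample $x$, then the overall risk $R(h)$ is minimized as well. This yields the Bayes decision rule: to minimize the overall risk, simply choose for each sample the class label that minimizes the conditional risk $R(c|x)$, that is,\n",
    "$$h^*(x)=\\arg\\min_{c\\in Y}R(c|x)$$"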
   ]
  },
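  {
   "cell_type": "markdown",
   "id": "reserved-pendant",
   "metadata": {},
   "source": [
    "Here $h^*$ is called the Bayes optimal classifier, and the corresponding overall risk $R(h^*)$ is called the Bayes risk. $1-R(h^*)$ reflects the best performance a classifier can achieve, that is, the theoretical upper bound on the accuracy of any model produced by machine learning."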
   ]
  },
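  {
   "cell_type": "markdown",
   "id": "worked-risk-example",
   "metadata": {},
   "source": [
    "As a small worked illustration of the Bayes decision rule, assume hypothetical values (chosen for illustration, not taken from any dataset): two classes with posteriors $P(c_1|x)=0.7$ and $P(c_2|x)=0.3$, and losses $\\lambda_{11}=\\lambda_{22}=0$, $\\lambda_{12}=5$, $\\lambda_{21}=1$. The conditional risks are then\n",
    "$$R(c_1|x)=\\lambda_{11}P(c_1|x)+\\lambda_{12}P(c_2|x)=0\\times 0.7+5\\times 0.3=1.5$$\n",
    "$$R(c_2|x)=\\lambda_{21}P(c_1|x)+\\lambda_{22}P(c_2|x)=1\\times 0.7+0\\times 0.3=0.7$$\n",
    "Under these assumed numbers $h^*(x)=c_2$, even though $c_1$ has the larger posterior, because mistaking a true $c_2$ for $c_1$ is five times as costly as the reverse error."
   ]
  },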
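  {
   "cell_type": "markdown",
   "id": "adopted-pottery",
   "metadata": {},
   "source": [
    "Specifically, if the goal is to minimize the classification error rate, the misclassification loss can be written as $\\lambda_{ij}=0$ if $i=j$ and $\\lambda_{ij}=1$ otherwise.\n",
    "The conditional risk then becomes\n",
    "$$R(c|x)=1-P(c|x),$$\n",
    "so the Bayes optimal classifier that minimizes the classification error rate is\n",
    "$$h^*(x)=\\arg\\max_{c\\in Y}P(c|x),$$\n",
    "that is, for each sample $x$ it selects the class label with the largest posterior probability $P(c|x)$."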
   ]
  },
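  {
   "cell_type": "markdown",
   "id": "zero-one-derivation",
   "metadata": {},
   "source": [
    "The step from the 0/1 loss to this conditional risk can be spelled out as follows; it only uses the fact that the posteriors sum to one over the $N$ classes:\n",
    "$$R(c_i|x)=\\sum_{j=1}^N\\lambda_{ij}P(c_j|x)=\\sum_{j\\neq i}P(c_j|x)=1-P(c_i|x)$$\n",
    "Minimizing $1-P(c|x)$ over $c$ is therefore the same as maximizing $P(c|x)$."
   ]
  },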
  {
   "cell_type": "markdown",
   "id": "occupied-survey",
   "metadata": {},
   "source": [
    "# Naive Bayes Classifier"
   ]
  },
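  {
   "cell_type": "markdown",
   "id": "southwest-method",
   "metadata": {},
   "source": [
    "The naive Bayes classifier adopts the \"attribute conditional independence assumption\": given the class, all attributes are assumed to be mutually independent. In other words, each attribute is assumed to influence the classification result independently. Under this assumption, the posterior can be rewritten as\n",
    "$$P(c|x)=\\frac{P(c)P(x|c)}{P(x)}=\\frac{P(c)}{P(x)}\\displaystyle\\prod_{i=1}^dP(x_i|c)$$\n",
    "where $d$ is the number of attributes and $x_i$ is the value of $x$ on the $i$-th attribute."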
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acknowledged-violin",
   "metadata": {},
   "source": [
    "Since $P(x)$ is the same for all classes, the Bayes decision rule gives\n",
    "$$h_{nb}(x)=\\arg\\max_{c\\in Y}P(c)\\displaystyle\\prod_{i=1}^dP(x_i|c)$$"
   ]
  },
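  {
   "cell_type": "markdown",
   "id": "log-space-form",
   "metadata": {},
   "source": [
    "When this rule is evaluated numerically, a product of many small probabilities can underflow in floating point, so an equivalent logarithmic form is commonly used. The following is a sketch that assumes $P(c)>0$ and every $P(x_i|c)>0$; it is an implementation note rather than part of the derivation. Because $\\log$ is strictly increasing, the maximizing class is unchanged:\n",
    "$$h_{nb}(x)=\\arg\\max_{c\\in Y}\\left[\\log P(c)+\\sum_{i=1}^d\\log P(x_i|c)\\right]$$"
   ]
  },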
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "smooth-industry",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
